Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

status: Remove activeRevision label and generate one dedicated metric with it #4600

Merged
merged 9 commits into from
Apr 26, 2022

Conversation

costimuraru
Copy link
Contributor

Fixes #4584

Copy link
Contributor

@srenatus srenatus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👏 thanks for picking this up!

A couple of small things we need to get into place here:

  1. DCO sign-off, please see here
  2. Can you adapt these tests, please? https://github.com/open-policy-agent/opa/blob/main/plugins/status/plugin_go1.17_test.go#L23
  3. Something with the formatting, the linter complains. It's easiest to run gofmt -s on the changed files, I think; the linter output is unfortunately quite hard to read.

Thanks again!! 🎉

docs/content/monitoring.md Outdated Show resolved Hide resolved
cmuraru added 2 commits April 20, 2022 15:30
… with it

Signed-off-by: cmuraru <cmuraru@adobe.com>
Signed-off-by: cmuraru <cmuraru@adobe.com>
@costimuraru costimuraru force-pushed the fix-metrics-cardinality branch from 90f652e to 19836b2 Compare April 20, 2022 12:30
Signed-off-by: cmuraru <cmuraru@adobe.com>
@costimuraru
Copy link
Contributor Author

Thanks for the feedback, @srenatus. Will update the test shortly.

@@ -47,6 +48,10 @@ func (s *Status) SetActivateSuccess(revision string) {
// download.
func (s *Status) SetDownloadSuccess() {
s.LastSuccessfulDownload = time.Now().UTC()

if s.FirstSuccessfulActivation.IsZero() {
s.FirstSuccessfulActivation = s.LastSuccessfulDownload
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we setting the FirstSuccessfulActivation to last LastSuccessfulDownload. A successful download does not imply a successful activation. I could imagine setting this value in the SetActivateSuccess method.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, @ashutosh-narkar. I missed that we already keep the last success bundle activation.
In light of this, I changed the code a bit:

  1. we keep the activeRevision label (just on) the last_success_bundle_activation metric.
  2. the gauge gets reset, so we only keep the last active_revision (instead of keeping them all and therefore avoiding the situation where the /metrics output grows indefinitely)

What do you think, @rafaelreinert?

# HELP last_success_bundle_activation Gauge for the last success bundle activation.
# TYPE last_success_bundle_activation gauge
last_success_bundle_activation{active_revision="1650490742504",name="authz"} 1.650490748253708e+18
# HELP last_bundle_request Gauge for the last bundle request.
# TYPE last_bundle_request gauge
last_bundle_request{name="authz"} 1.650490762751247e+18
# HELP last_success_bundle_download Gauge for the last success bundle download.
# TYPE last_success_bundle_download gauge
last_success_bundle_download{name="authz"} 1.6504907480146977e+18
# HELP last_success_bundle_request Gauge for the last success bundle request.
# TYPE last_success_bundle_request gauge
last_success_bundle_request{name="authz"} 1.650490762751247e+18
# HELP plugin_status_gauge Gauge for the plugin by status.
# TYPE plugin_status_gauge gauge
plugin_status_gauge{name="bundle",status="OK"} 1
plugin_status_gauge{name="decision_logs",status="OK"} 1
plugin_status_gauge{name="discovery",status="OK"} 1
plugin_status_gauge{name="status",status="OK"} 1

I also managed to use the last_success_bundle_activation to add an annotation in Grafana, which contains the active_revision label. In this way, we are able to see when (and which) new bundle is activated in OPA and correlate with the other metrics.

Screenshot 2022-04-21 at 00 48 25

Screenshot 2022-04-21 at 00 41 00

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, solves the cardinality issue and keeps the active_revision of the last bundle. 😄

cmuraru added 2 commits April 21, 2022 00:11
Signed-off-by: cmuraru <cmuraru@adobe.com>
Signed-off-by: cmuraru <cmuraru@adobe.com>
@costimuraru costimuraru force-pushed the fix-metrics-cardinality branch from 1bacce9 to 93703d0 Compare April 20, 2022 21:43
Signed-off-by: cmuraru <cmuraru@adobe.com>
@costimuraru costimuraru force-pushed the fix-metrics-cardinality branch from 92a5be0 to bc83360 Compare April 20, 2022 21:53
Copy link
Contributor

@srenatus srenatus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot 👏

@srenatus srenatus merged commit d819c1e into open-policy-agent:main Apr 26, 2022
rokkiter pushed a commit to rokkiter/opa that referenced this pull request Apr 26, 2022
…y-agent#4600)

Having one activeRevision label on each of the prometheus metrics emitted
by the status plugin has proven to be problematic with a large number of
bundles. So with this change,

1. we keep the activeRevision label (just on) the last_success_bundle_activation metric.
2. the gauge gets reset, so we only keep the last active_revision (instead of keeping
   them all and therefore avoiding the situation where the /metrics output grows indefinitely)

Fixes open-policy-agent#4584.

Signed-off-by: cmuraru <cmuraru@adobe.com>
damienjburks added a commit to damienjburks/opa that referenced this pull request May 18, 2022
# This is the 1st commit message:

finalizing changes for formatting with sprintf

Signed-off-by: Damien Burks <damien@damienjburks.com>

# This is the commit message open-policy-agent#2:

updating changes to allow for multiple format strings

Signed-off-by: Damien Burks <damien@damienjburks.com>

# This is the commit message open-policy-agent#3:

fixing golint issues

Signed-off-by: Damien Burks <damien@damienjburks.com>

# This is the commit message open-policy-agent#4:

fixing golint issues

Signed-off-by: Damien Burks <damien@damienjburks.com>

# This is the commit message open-policy-agent#5:

making recommended change: package level variable

Signed-off-by: Damien Burks <damien@damienjburks.com>

# This is the commit message open-policy-agent#6:

adding support for explicit argument indexes

Signed-off-by: Damien Burks <damien@damienjburks.com>

# This is the commit message open-policy-agent#7:

format: don't add 'in' keyword import when 'every' is there (open-policy-agent#4607)

Also ensure that added imports have a location set.

Previously, `opa fmt` on the added test file would have panicked
because the import hadn't had a location.

Fixes open-policy-agent#4606.

Signed-off-by: Stephan Renatus <stephan.renatus@gmail.com>
# This is the commit message open-policy-agent#8:

ast+topdown+planner: allow for mocking built-in functions via "with" (open-policy-agent#4540)

With this change, we can replace calls to built-in functions via `with`. The replacement
can either be a value -- which will be used as the return value for every call to the
mocked built-in -- or a reference to a non-built-in function -- when the results need
to depend on the call's arguments.

Compiler, topdown, and planner have been adapted in this change. The included
docs changes describe the replacement options further.

Fixes first part of open-policy-agent#4449. (Missing are non-built-in functions as mock targets.)

Signed-off-by: Stephan Renatus <stephan.renatus@gmail.com>
# This is the commit message open-policy-agent#9:

build(deps): bump google.golang.org/grpc from 1.45.0 to 1.46.0 (open-policy-agent#4617)


# This is the commit message open-policy-agent#10:

docs/policy-testing: use assignment operator in mocks (open-policy-agent#4618)

Additionally, simplify one test example.

Signed-off-by: Anders Eknert <anders@eknert.com>
# This is the commit message open-policy-agent#11:

cmd/capabilities: expose capabilities through CLI (open-policy-agent#4588)

There is a new command argument "capabilities". With this, it is
possible to print the current capabilities version, show all
capabilities versions & print any capabilities version, without the need
of a file. Moreover, for the other commands which use the --capabilities
flag, it is possible to give only the version number, without specifying
a file. However, there are no breaking changes for those who use the
capabilities file as an input for the flag. Unit tests were also
written, in order to test the new argument and the changes made in ast.

Fixes: open-policy-agent#4236

Signed-off-by: IoannisMatzaris <matzarisioannis@gmail.com>
# This is the commit message open-policy-agent#12:

format,eval: don't use source locations when formatting PE output (open-policy-agent#4611)

* format: allow ignoreing source locations
* cmd/eval: format disregarding source locations for partial result

Before, we'd see this output:
```
$ opa eval -p -fsource 'time.clock(input.x)==time.clock(input.y)'
# Query 1
time.clock(time.clock(input.x), input.y)
```

Now, we get the proper answer: `time.clock(input.y, time.clock(input.x))`.

Note that it's a _display_ issue; the JSON output of PE has not been affected.

Fixes open-policy-agent#4609.

Signed-off-by: Stephan Renatus <stephan.renatus@gmail.com>
# This is the commit message open-policy-agent#13:

build(deps): bump github/codeql-action from 1 to 2 (open-policy-agent#4621)

Bumps [github/codeql-action](https://github.com/github/codeql-action) from 1 to 2.
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](github/codeql-action@v1...v2)

---
updated-dependencies:
- dependency-name: github/codeql-action
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
# This is the commit message open-policy-agent#14:

status: Remove activeRevision label on all but one metric (open-policy-agent#4600)

Having one activeRevision label on each of the prometheus metrics emitted
by the status plugin has proven to be problematic with a large number of
bundles. So with this change,

1. we keep the activeRevision label (just on) the last_success_bundle_activation metric.
2. the gauge gets reset, so we only keep the last active_revision (instead of keeping
   them all and therefore avoiding the situation where the /metrics output grows indefinitely)

Fixes open-policy-agent#4584.

Signed-off-by: cmuraru <cmuraru@adobe.com>
# This is the commit message open-policy-agent#15:

website: add playground button to navbar (open-policy-agent#4622)

Addressing one tiny bit of open-policy-agent#4614.

Signed-off-by: Stephan Renatus <stephan.renatus@gmail.com>
# This is the commit message open-policy-agent#16:

topdown/net: require prefix length for IPv6 in net.cidr_merge (open-policy-agent#4613)

There are no default prefixes in IPv6, so if an IPv6 without a prefix is fed into
net.cidr_merge, we'll return a non-halt error now.

Before, we'd fail in various ways if a prefix-less IPv6 was fed into
`net.cidr_merge`. With only one, we'd return `[ "<nil>" ]`, with two,
we'd panic.

Fixes open-policy-agent#4596.

Signed-off-by: Stephan Renatus <stephan.renatus@gmail.com>
# This is the commit message open-policy-agent#17:

Dockerfile: add source annotation (open-policy-agent#4626)

`org.opencontainers.image.source` URL to get source code for building the image (string)

https://github.com/opencontainers/image-spec/blob/main/annotations.md

Signed-off-by: Stephan Renatus <stephan.renatus@gmail.com>
# This is the commit message open-policy-agent#18:

build(deps): bump github.com/fsnotify/fsnotify v1.5.2 -> v1.5.4 (open-policy-agent#4628)

https://github.com/fsnotify/fsnotify/releases/tag/v1.5.4

Signed-off-by: Stephan Renatus <stephan.renatus@gmail.com>
# This is the commit message open-policy-agent#19:

docs: update version in kubernetes examples (open-policy-agent#4627)

Signed-off-by: yongen.pan <yongen.pan@daocloud.io>
# This is the commit message open-policy-agent#20:

bundle/status: Include bundle type in status information

OPA has support for Delta Bundles. The status object already
contains valuable information such as last activation timestamp but
does not specify if the bundle was a canonical snapshot or delta.

This change updates the bundle.Status object to include the
bundle type string: either "snapshot" or "delta". This can be useful
for status endpoints to differentiate between the bundle types.

Issue: 4477

Signed-off-by: Bryan Fulton <bryan@styra.com>

# This is the commit message open-policy-agent#21:

ast+topdown+planner: replacement of non-built-in functions via 'with' (open-policy-agent#4616)

Follow-up to open-policy-agent#4540

We can now mock functions that are user-defined:

    package test

    f(_) = 1 {
        input.x = "x"
    }
    p = y {
        y := f(1) with f as 2
    }

...following the same scoping rules as laid out for built-in mocks.
The replacement can be a value (replacing all calls), or a built-in,
or another non-built-in function.

Also addresses bugs in the previous slice:
* topdown/evalCall: account for empty rules result from indexer
* topdown/eval: capture value replacement in PE could panic

Note: in PE, we now drop 'with' for function mocks of any kind:

These are always fully replaced in the saved support modules, so
this should be OK.

When keeping them, we'd also have to either copy the existing definitions
into the support module; or create a function stub in it.

Fixes open-policy-agent#4449.

Signed-off-by: Stephan Renatus <stephan.renatus@gmail.com>
# This is the commit message open-policy-agent#22:

format: keep whitespaces for multiple indented same-line withs (open-policy-agent#4635)

Fixes open-policy-agent#4634.

Signed-off-by: Stephan Renatus <stephan.renatus@gmail.com>
# This is the commit message open-policy-agent#23:

downloader: support for downloading bundles from an OCI registry (open-policy-agent#4558)

Initial support for open-policy-agent#4518.

Configuration uses the 'services' config for registries, via the "type: oci" field.
Bundles configured to pull from that service will then use OCI.

```
services:
  ghcr-registry:
    url: https://ghcr.io
    type: oci
bundles:
  authz:
    service: ghcr-registry
    resource: ghcr.io/${ORGANIZATION}/${REPOSITORY}:${TAG}
    persist: true
    polling:
      min_delay_seconds: 60
      max_delay_seconds: 120
persistence_directory: ${PERSISTENCE_PATH}
```

Service credentials are supported: if you want to pull from a private registry,
use
```
services:
  ghcr-registry:
    url: https://ghcr.io
    type: oci
    credentials:
      bearer:
        token: ${GH_PAT}
```

If no `persistence_directory` is configured, the data is stored in a directory under /tmp.

See docs/devel/OCI.md for manual steps to test this feature with some
OCI registry (like ghcr.io).

Signed-off-by: carabasdaniel <dani@aserto.com>
# This is the commit message open-policy-agent#24:

Prepare v0.40.0 Release (open-policy-agent#4631)

Signed-off-by: Stephan Renatus <stephan.renatus@gmail.com>
# This is the commit message open-policy-agent#25:

Prepare v0.41.0 development (open-policy-agent#4636)

Signed-off-by: Stephan Renatus <stephan.renatus@gmail.com>
# This is the commit message open-policy-agent#26:

docs: Adding example for `rego.metadata.role()` usage (open-policy-agent#4640)

Signed-off-by: Johan Fylling <johan.dev@fylling.se>
# This is the commit message open-policy-agent#27:

build(deps): bump oras.land/oras-go from 1.1.0 to 1.1.1 (open-policy-agent#4643)

Bumps [oras.land/oras-go](https://github.com/oras-project/oras-go) from 1.1.0 to 1.1.1.
- [Release notes](https://github.com/oras-project/oras-go/releases)
- [Commits](oras-project/oras-go@v1.1.0...v1.1.1)

---
updated-dependencies:
- dependency-name: oras.land/oras-go
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
# This is the commit message open-policy-agent#28:

build(deps): bump OpenTelemetry 1.6.3 -> 1.7.0 (open-policy-agent#4649)

https://github.com/open-telemetry/opentelemetry-go/releases/tag/v1.7.0
https://github.com/open-telemetry/opentelemetry-go-contrib/releases/tag/v1.7.0

Signed-off-by: Stephan Renatus <stephan.renatus@gmail.com>
# This is the commit message open-policy-agent#29:

build(deps): bump github.com/containerd/containerd from 1.6.2 to 1.6.3 (open-policy-agent#4654)

Bumps [github.com/containerd/containerd](https://github.com/containerd/containerd) from 1.6.2 to 1.6.3.
- [Release notes](https://github.com/containerd/containerd/releases)
- [Changelog](https://github.com/containerd/containerd/blob/main/RELEASES.md)
- [Commits](containerd/containerd@v1.6.2...v1.6.3)

---
updated-dependencies:
- dependency-name: github.com/containerd/containerd
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
# This is the commit message open-policy-agent#30:

Update k8s examples to the latest schema (open-policy-agent#4655)

Signed-off-by: Víctor Martínez Bevià <vicmarbev@gmail.com>
# This is the commit message open-policy-agent#31:

Fix incorrect padding claims (open-policy-agent#4657)

Signed-off-by: Anders Eknert <anders@eknert.com>
# This is the commit message open-policy-agent#32:

build(deps): bump github.com/containerd/containerd from 1.6.3 to 1.6.4 (open-policy-agent#4662)

Bumps [github.com/containerd/containerd](https://github.com/containerd/containerd) from 1.6.3 to 1.6.4.
- [Release notes](https://github.com/containerd/containerd/releases)
- [Changelog](https://github.com/containerd/containerd/blob/main/RELEASES.md)
- [Commits](containerd/containerd@v1.6.3...v1.6.4)

---
updated-dependencies:
- dependency-name: github.com/containerd/containerd
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
# This is the commit message open-policy-agent#33:

build(deps): bump docker/setup-qemu-action from 1 to 2 (open-policy-agent#4668)


# This is the commit message open-policy-agent#34:

build(deps): bump docker/setup-buildx-action from 1 to 2 (open-policy-agent#4669)

Bumps [docker/setup-buildx-action](https://github.com/docker/setup-buildx-action) from 1 to 2.
- [Release notes](https://github.com/docker/setup-buildx-action/releases)
- [Commits](docker/setup-buildx-action@v1...v2)

---
updated-dependencies:
- dependency-name: docker/setup-buildx-action
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
# This is the commit message open-policy-agent#35:

build(deps): github.com/bytecodealliance/wasmtime-go 0.35.0 -> 0.36.0 (open-policy-agent#4652)

* build(deps): bump wasmtime-go: 0.35.0 -> 0.36.0
* internal/wasm: adapt to using epoch-based interruption

Looks like we don't get frames for this.

Also, there is currentlty no better way than comparing the message,
as the trap code isn't surfaced (yet).

Fixes open-policy-agent#4663.

Signed-off-by: Stephan Renatus <stephan.renatus@gmail.com>
# This is the commit message open-policy-agent#36:

ecosystem: Add Sansshell (open-policy-agent#4674)


Signed-off-by: James Chacon <james.chacon@snowflake.com>
# This is the commit message open-policy-agent#37:

topdown: Add units.parse builtin (open-policy-agent#4676)

This function works on all base decimal and binary SI units of the set:

    m, K/Ki, M/Mi, G/Gi, T/Ti, P/Pi, and E/Ei

Note: Unlike `units.parse_bytes`, this function is case sensitive.

Fixes open-policy-agent#1802.

Signed-off-by: Philip Conrad <philipaconrad@gmail.com>
# This is the commit message open-policy-agent#38:

docs/contrib-code: Add capabilities step to built-in functions tutorial (open-policy-agent#4677)

Signed-off-by: Philip Conrad <philipaconrad@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

OPA generates a lot of bundle metrics and floods system
4 participants